PREFACE


In February 1992, the Current Operations Branch of Air Force Global Weather Central
(AFGWC/DOO)  requested  a  comparison  study  of  upper-air  quality  control  (QC)  methods 
used  by  AFGWC  and  the  National  Meteorological  Center  (NMC).  The  request  stemmed 
from  an  offer  by  NMC  to  provide  rawinsonde  observations  (raobs),  quality-controlled  by 
their  algorithm,  to  AFGWC.  Since  AFGWC  now  QCs  and  corrects  its  own  raobs,  the 
advantages,  disadvantages,  differences,  and  any  added  value  of  each  correction  scheme 
had  to  be  determined  before  accepting  the  offer.  The  Simulations  and  Techniques  Branch 
(SYT)  at  USAFETAC  completed  the  comparison  under  project  number  920313.  The 
author/analyst  was  Capt  David  J.  Speltz,  who  wishes  to  thank  Dr.  William  G.  Collins  of 
NMC  for  the  wealth  of  information  he  provided  on  the  CQCHT  algorithm,  as  well  as  for 
output from the program.

1. INTRODUCTION


1.1 Purpose of Study. This study compares the
output  of  the  upper-air  quality  control  (QC) 
methods  used  by  the  Air  Force  Global  Weather 
Central  (AFGWC)  with  those  of  the  National 
Meteorological  Center  (NMC).  For  its  upper-air 
QC,  AFGWC  uses  the  New  Upper-Air  Validator 
(NUAV)  which  became  operational  on  22 
December 1986 (Zamiska, 1990). NMC uses the
Complex  QC  procedure  for  Rawinsonde  Heights 
and  Temperatures  (CQCHT)  algorithm,  which  has 
been  operational  since  November  1991  (Collins, 
1991).  This  study  identifies  advantages, 
disadvantages,  and  any  added  value  of  each 
correction  scheme. 

1.2  Data  Used.  At  the  end  of  each  month 
summaries  of  QC  results  were  produced  for  both 
NUAV  and  CQCHT.  NUAV  results,  stored  in  the 
DATSAV2  data  format,  were  obtained  through  the 
Climatic  Operations  Branch  (GCO)  of  OL-A, 
USAFETAC,  in  Asheville,  NC.  The  CQCHT  data 
was  provided  by  Dr.  William  G.  Collins  of  NMC 
in  Washington,  DC.  Data  for  the  months  of  July 
and  November  was  used  for  this  study. 

1.3 Methodology. Samples of observations that
had been QC'ed by both algorithms were selected
at random. Each error in the samples was
examined manually and categorized based on its
characteristics. Conclusions were drawn
from the number of observations in each category.
Bulk  statistics  describing  the  output  for  each 
algorithm  were  also  examined.  Finally,  the 
advantages  and  disadvantages  of  each  algorithm 
were  compared  subjectively  to  determine  which 
was  more  effective. 

1.4 Difficulties. Although both algorithms try
to  achieve  the  same  thing  (to  correct  or  at  least 
detect  incorrect  observations),  differences  in  the 
methods  used  by  each  complicated  comparisons. 
For  example,  since  NMC  uses  a  more  stringent 
cutoff  time  than  AFGWC,  more  observations  get 
into  the  AFGWC  database  than  into  NMC's.  This 
leads to problems in determining whether CQCHT
missed  an  obvious  error  or  just  simply  never 
checked  the  station  in  question  at  all.  The  CQCHT 
output  provides  much  more  information  about  the 
nature  of  the  error  than  NUAV;  this  NUAV 
shortfall  often  makes  errors  detected  by  NUAV 
difficult  to  evaluate.  These  were  just  a  few  of  the 
many  problems  encountered  in  attempting  a 
comparison  of  this  type. 

1.5  Results.  A  careful  search  of  the  literature, 
along  with  comparisons  of  2  months  of  data 
processed by NUAV and CQCHT, shows CQCHT
to  be  the  better  algorithm.  Not  only  does  it  detect 
more  errors,  but  it  generally  makes  more 
reasonable corrections as well.

2. COMPARING QUALITY CONTROL ALGORITHMS


2.1 New Upper-Air Validator (NUAV).

Automated  QC  of  weather  data  has  been  a 
necessity  since  the  beginning  of  the  age  of 
numerical weather prediction in the mid-1950s.
Automated  QC  methods  have  come  a  long  way 
since  then,  and  they  continue  to  be  improved 
every  year  by  the  various  numerical  weather 
prediction  centers.  AFGWC  recognized  the  need  to 
update  their  QC  system  in  the  late  1970s  and 
began  work  on  the  upgrade  in  1982.  Among  the 
many  problems  with  the  old  algorithm  were  a  lack 
of  sensitivity  in  the  height  and  temperature  checks, 
misinterpreted  thickness  checks,  and  errors  in 
processing  and  storing  data.  NUAV,  which  became 
operational  on  22  December  1986,  solved  many  of 
the  problems  (Zamiska,  1990). 

2.2 Complex QC Procedure for Rawinsonde
Heights and Temperatures (CQCHT). The
National Meteorological Center (NMC) began
designing a new QC system from scratch in 1988
(Gandin and Collins, 1992). This system, which
became operational in early 1989, was called
Comprehensive Hydrostatic Quality Control
(CHQC). It comprised two major parts: (1) the
statistical  checks  that  produce  numerical  residuals 
and  (2)  the  Decision  Making  Algorithm  (DMA) 
that  analyzes  the  residuals  before  reaching  a 
decision.  The  DMA  tries  to  determine  the  origin  of 
each  error  and  correct  it  rather  than  simply 
rejecting  it. 

CHQC  was  the  first  QC  algorithm  in  this  country 
to  apply  this  approach  (Gandin  and  Collins,  1992). 
An  advanced  method  (called  “Complex  Quality 
Control,”  or  “CQC”)  has  been  in  use  at  the 
Hydrometeorological  Center  in  Moscow  since 
1979  (Gandin,  1988). 


Dr.  Lev  Gandin,  formerly  of  the 
Hydrometeorological Center and now at NMC,
was  instrumental  in  bringing  the  CQC  concept  to 
the  United  States. 

CQCHT replaced CHQC in November 1991.
CQCHT is similar to CHQC, but it includes
several  additional  statistical  checks  and  uses  a 
more  advanced  DMA.  These  upgrades  allow 
CQCHT  to  make  more  corrections  automatically. 
Table 1 shows the types of errors that CQCHT can
automatically detect and correct. In contrast,
CHQC performed corrections only on Types 1 and
2 and 7 through 10. Not only are more corrections
possible,  but  a  higher  degree  of  confidence  is 
placed  in  each  correction. 

Table 1. Errors that CQCHT automatically detects
and corrects (Gandin and Collins, 1992).

Type  Error

1     Large height error at an intermediate level (not the
      highest or lowest)
2     Large temperature error at an intermediate level
3     Errors in height and temperature at the same level
4     Error(s) in height and/or temperature at the lowest
      reported level
5     Error in either height or temperature at the highest level,
      or error in both
6     Computational error in layer thickness
7     Errors in heights of two adjacent layers
8     Errors in temperatures of two adjacent layers
9     Adjacent errors in height below and temperature above
10    Adjacent errors in temperature below and height above
11    Medium-size height error at an intermediate level
13    Data hole including upper Part A levels
14    Data hole (different from Type 13 error)
22    Medium-size temperature error at an intermediate level
100   Surface pressure or station elevation error
      (communications-related)
101   Height error in lowest level when its temperature is
      missing
102   Undetermined error in the lowest level (no correction
      made)
106   Observational error in the surface pressure
116   Computational error in height of the lowest level

2.3 Quality-Control Check Summary. A
summary of major checks on upper-air data
accuracy used by NUAV and CQCHT follows.

2.3.1 Hydrostatic Check. This check, used in both
algorithms, is the most powerful. The hydrostatic
check  is  based  on  the  redundancy  of  reported 
heights  and  temperatures  in  the  rawinsonde  data. 
Rawinsondes  do  not  measure  heights  directly; 
heights  are  calculated  from  measured  temperatures 
and  pressures  using  the  hydrostatic  equation.  The 
thickness  of  each  layer  may  be  calculated  by  either 
determining  the  difference  in  heights  of  the 
boundaries,  or  by  using  the  measured  temperatures 
and  pressures  in  the  hydrostatic  equation.  The 
difference  between  these  two  thickness  values  is 
called the “hydrostatic residual,” which should be
zero  or  near-zero  since  the  hydrostatic  equation 
was  used  to  compute  the  heights  in  the  first  place. 
If  the  values  do  not  agree  hydrostatically,  there  is 
an  error  in  one  of  the  following  areas: 

•  Computation  at  the  observation  location 

•  Data  entry  (e.g.,  digits  transposed) 

•  Data  transfer 

•  Decoding  of  the  data 

Both  NUAV  and  CQCHT  use  the  magnitude  of 
the  hydrostatic  residuals  to  detect  errors,  as  well  as 
to  help  determine  what  corrections  to  make. 
Observational  errors,  like  those  resulting  from  a 
broken  sensor,  are  NOT  detected  by  this  method. 
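
The hydrostatic residual described above can be illustrated with a minimal Python sketch. It is not taken from NUAV or CQCHT. The function names, the use of a simple average of the two boundary temperatures as the layer-mean temperature, the neglect of moisture, the sign convention (reported thickness minus computed thickness), and the sample values are all assumptions made for illustration.

```python
import math

R_D = 287.05   # gas constant for dry air, J/(kg K)
G0 = 9.80665   # standard gravity, m/s^2

def hydrostatic_thickness(p_bottom_mb, p_top_mb, t_bottom_c, t_top_c):
    """Layer thickness (m) from the hypsometric equation.

    Assumption: the layer-mean temperature is the simple average of the
    two boundary temperatures, and moisture is ignored.
    """
    t_mean_k = 0.5 * (t_bottom_c + t_top_c) + 273.15
    return (R_D * t_mean_k / G0) * math.log(p_bottom_mb / p_top_mb)

def hydrostatic_residual(p_bottom_mb, p_top_mb, t_bottom_c, t_top_c,
                         h_bottom_m, h_top_m):
    """Reported thickness minus hydrostatically computed thickness (m)."""
    reported = h_top_m - h_bottom_m
    computed = hydrostatic_thickness(p_bottom_mb, p_top_mb, t_bottom_c, t_top_c)
    return reported - computed

# Illustrative (made-up) 500-400 mb layer with consistent data.
good = hydrostatic_residual(500, 400, -21.0, -33.0, 5570.0, 7180.0)

# Same layer, but with the 500-mb height corrupted by +3,000 m.
bad = hydrostatic_residual(500, 400, -21.0, -33.0, 8570.0, 7180.0)

print(round(good, 1), round(bad, 1))
```

With consistent data the residual is a few meters; with the corrupted height it is close to -3,000 meters for the layer above the bad level, which is the kind of signal both algorithms look for.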

2.3.2 Increment Check. An “increment” is defined
as  the  difference  between  the  reported  value  and 
its  forecast  “first  guess.”  The  first  guess  is  a 
6-hour  forecast  from  a  numerical  model.  The 
increment  check  is  performed  on  height  and 
temperature;  some  form  of  it  is  used  by  both 
methods.  CQCHT  uses  the  value  of  the  increment 
in  statistical  checks,  while  NUAV  flags  suspected 
observations  in  which  increments  exceed 
predetermined  numerical  limits.  It’s  important  to 
note  the  distinction  between  the  quantitative  way 
in  which  CQCHT  uses  this  check  and  the 
qualitative  flagging  performed  by  NUAV.  The 
value of the increment check lies in its ability to
provide additional information to confirm, reject,
or refine the findings of the hydrostatic check.
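
The distinction between qualitative flagging and quantitative use of the increment can be sketched as follows. This is an illustration only: the limit of 100 meters, the expected spread of 20 meters, and the idea of scaling the increment by an expected spread are assumptions, not values or formulas documented for NUAV or CQCHT.

```python
def increment(reported, first_guess):
    """Reported value minus the 6-hour forecast first guess."""
    return reported - first_guess

def flag_style_check(incr, limit):
    """Qualitative use: True when the increment exceeds a fixed limit."""
    return abs(incr) > limit

def residual_style_check(incr, expected_spread):
    """Quantitative use: keep the size and sign of the increment,
    scaled by an assumed expected spread, for later decision making."""
    return incr / expected_spread

# Hypothetical 500-mb height report and first guess (meters).
incr = increment(5460.0, 5520.0)
suspect = flag_style_check(incr, limit=100.0)               # hypothetical limit
scaled = residual_style_check(incr, expected_spread=20.0)   # hypothetical spread

print(incr, suspect, scaled)
```

In this example the flag-style check passes the observation, while the scaled value still records that the report sits three expected spreads below the first guess, information that remains available to support or contradict the other checks.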

2.3.3 Horizontal Check. This check uses the
increments  of  the  four  nearest  stations,  each  in  a 
different  quadrant.  From  these  four  increments  the 
value  of  the  point  in  question  is  interpolated.  If  the 
interpolated  increment  differs  greatly  from  the 
calculated  increment,  then  the  data  (temperature  or 
height)  is  considered  suspect.  Only  NMC's 
CQCHT employs this check.
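
A minimal sketch of a horizontal check follows. The text above states only that the value is interpolated from the increments of four neighboring stations; the inverse-distance weighting used here is an assumption chosen for simplicity, and the function names, distances, and increments are hypothetical.

```python
def interpolated_increment(neighbors):
    """Interpolate an increment from neighboring stations.

    neighbors: list of (distance_km, increment) pairs. The caller is
    expected to supply the nearest station from each of four quadrants.
    Assumption: inverse-distance weighting.
    """
    weights = [1.0 / distance for distance, _ in neighbors]
    weighted_sum = sum(w * incr for w, (_, incr) in zip(weights, neighbors))
    return weighted_sum / sum(weights)

def horizontal_residual(station_increment, neighbors):
    """Station increment minus the increment interpolated from neighbors."""
    return station_increment - interpolated_increment(neighbors)

# Hypothetical height increments (m) at four neighbors, one per quadrant.
neighbors = [(200.0, 10.0), (200.0, 20.0), (400.0, 0.0), (400.0, 30.0)]
expected = interpolated_increment(neighbors)
residual = horizontal_residual(2017.0, neighbors)

print(expected, residual)
```

A station whose increment is near 2,000 meters while its neighbors show increments of a few tens of meters produces a horizontal residual of about 2,000 meters, marking the value as suspect.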

2.3.4  Vertical  Check.  This  check  is  performed  in 
a  manner  similar  to  the  horizontal  check,  but  now 
the  size  of  the  vertical  residual  is  examined.  The 
vertical  residual  is  the  difference  between  the 
increment  (height  or  temperature)  at  the  level  in 
question  and  the  increment  value  interpolated  from 
the  mandatory  levels  above  and  below  this  level. 
NUAV  does  not  use  this  form  of  vertical  check, 
but  it  does  employ  a  temperature  validation  using 
lapse  rates.  If  the  lapse  rate  for  a  particular  layer 
of  the  sounding  is  outside  predetermined  limits  set 
by  OL-A,  USAFETAC,  steps  are  taken  to  reject  or 
correct  the  temperature(s)  causing  the  problem. 
CQCHT also examines lapse rates to ensure that
temperature  corrections  are  not  excessive. 
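
The vertical residual can be sketched in the same way. The interpolation method is not specified above; linear interpolation in the logarithm of pressure is assumed here, and the function name and sample increments are hypothetical.

```python
import math

def vertical_residual(p_mb, incr, p_below_mb, incr_below, p_above_mb, incr_above):
    """Increment at a level minus the increment interpolated from the
    mandatory levels below and above it.

    Assumption: linear interpolation in ln(pressure).
    """
    weight = (math.log(p_below_mb) - math.log(p_mb)) / (
        math.log(p_below_mb) - math.log(p_above_mb))
    interpolated = incr_below + weight * (incr_above - incr_below)
    return incr - interpolated

# Hypothetical height increments (m): 250 mb = 10, 200 mb = 2017, 150 mb = 20.
residual = vertical_residual(200.0, 2017.0, 250.0, 10.0, 150.0, 20.0)
print(round(residual, 1))
```

When the increment at the middle level is consistent with its neighbors, the residual is near zero; an isolated error at the middle level appears almost in full in the residual.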

2.3.5  Baseline  Check.  This  is  essentially  a 
hydrostatic  check  for  the  layer  between  the  surface 
and  the  lowest  reported  mandatory  level.  The 
thickness  between  the  two  lowest  mandatory  levels 
is  used  to  compute  an  average  temperature  from 
which  the  temperature  profile  of  the  lowest  layer 
is  computed  by  extrapolating  downward  to  the 
surface  pressure  using  the  standard  lapse  rate 
(6.5°  C/km).  These  assumptions  are  then  used  to 
solve  for  station  elevation,  which  is  compared  to 
the  official  station  elevation.  Large  discrepancies 
between the two values indicate 1,000-mb height
errors  or  an  incorrect  “official”  station  elevation 
(Collins  and  Gandin,  1990).  Both  algorithms  use 
some  form  of  this  check. 
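
One possible reading of the baseline check is sketched below. It is not the NUAV or CQCHT implementation. Assigning the layer-mean temperature to the mid-layer height, solving for the elevation by simple iteration, and every numerical value in the example (pressures, heights, and the "official" elevation of 560 meters) are assumptions made for illustration.

```python
import math

R_D = 287.05     # gas constant for dry air, J/(kg K)
G0 = 9.80665     # standard gravity, m/s^2
LAPSE = 0.0065   # standard lapse rate, K/m (6.5 C/km)

def estimated_station_elevation(p_sfc_mb, p1_mb, z1_m, p2_mb, z2_m, iterations=10):
    """Solve for station elevation (m) from surface pressure and the two
    lowest mandatory levels (p1 below p2 in height)."""
    # Mean temperature of the p1-p2 layer, recovered from its thickness.
    t_layer = G0 * (z2_m - z1_m) / (R_D * math.log(p1_mb / p2_mb))
    z_mid = 0.5 * (z1_m + z2_m)   # assumption: t_layer applies at mid-layer
    z_sfc = z1_m                  # first guess for the station elevation
    for _ in range(iterations):
        # Mean temperature of the surface-to-p1 layer, extrapolated
        # downward from mid-layer with the standard lapse rate.
        t_low = t_layer + LAPSE * (z_mid - 0.5 * (z_sfc + z1_m))
        z_sfc = z1_m - (R_D * t_low / G0) * math.log(p_sfc_mb / p1_mb)
    return z_sfc

# Hypothetical sounding: surface 950 mb, 850 mb at 1,500 m, 700 mb at 3,100 m.
elevation = estimated_station_elevation(950.0, 850.0, 1500.0, 700.0, 3100.0)
discrepancy = elevation - 560.0   # 560 m is a hypothetical official elevation

print(round(elevation), round(discrepancy))
```

A small discrepancy, as in this example, supports the reported lowest-level data; a discrepancy of hundreds of meters would point to a lowest-level height error or an incorrect official elevation.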

2.3.6  Gross  Error  Check.  NUAV  uses  a  list  of 
maximum  and  minimum  values  of  height  and 
temperature at mandatory pressure levels to detect
values that should be suspected or rejected. This is
one form of the wide plausibility check, which is
relatively  simple  to  design  and  apply,  but  the  CQC 
method  gets  the  same  results  and  much  more.  For 
these  reasons  Dr.  Gandin  considers  it  “hardly 
worthwhile  to  use  any  check  of  plausibility” 
(Gandin, 1988). In addition, gross error checks are
not  capable  of  making  confident  corrections  when 
used  alone.  Despite  these  limitations,  NUAV  is 
able  to  detect  numerous  errors  with  this  check. 
Since  temperature  errors  often  result  from 
switched  signs,  the  sign  is  switched  for  any 
temperature  within  10°  C  of  zero  and  the  new 
temperature is checked again. Flags are set when
a value is suspect or rejected. These flags are later
used to determine the overall quality of the
sounding and whether it should be used or not.
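
The gross error check with the sign switch can be sketched as follows. The limits used are hypothetical, a single pair of limits stands in for NUAV's separate SUSPECT and REJECT limits, and applying the sign switch only after the original value has failed is an assumption about the order of operations.

```python
def gross_check_temperature(t_c, t_min_c, t_max_c):
    """Return (value, flag) for a temperature at one mandatory level.

    flag is "ok", "sign_switched", or "rejected".
    """
    if t_min_c <= t_c <= t_max_c:
        return t_c, "ok"
    # Sign errors are common, so a value within 10 C of zero is retried
    # with its sign reversed.
    if abs(t_c) <= 10.0 and t_min_c <= -t_c <= t_max_c:
        return -t_c, "sign_switched"
    return t_c, "rejected"

# Hypothetical limits of -45 C to 0 C for one level.
print(gross_check_temperature(-20.0, -45.0, 0.0))
print(gross_check_temperature(8.0, -45.0, 0.0))
print(gross_check_temperature(25.0, -45.0, 0.0))
```

The second call shows the case the text describes: a reported 8 C fails the limits, but -8 C passes, so the sign is switched and the change is flagged.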

2.4 Discussion of Comparisons. Although both
algorithms use the powerful hydrostatic check, the
addition of increment, horizontal, and vertical
checks to CQCHT allows it to detect (and often
correct) additional errors. The added value of these
additional checks is illustrated in Table 2. This
data, from the June 1992 CQCHT summary,
shows  stations  suspected  of  Type  22  errors 
(medium-sized  temperature  errors). 


Table 2. CQCHT increment, horizontal, and vertical checks of temperature (T). June 1992 results
are shown for two stations. Hydrostatic residuals are for the layers above and below the layer in question.

Hailar, China        200   -50.1   -57.1    -4.2   -15   10.6   10.0   98
Kupung, Indonesia     50   -68.1   NO CHG    4.3    56    1.5    1.4   19

Both  stations  are  suspect  due  to  the  large 
hydrostatic  residuals  in  the  layers  above  and  below 
the  level  in  question,  but  in  the  case  of  station 
Kupung  (El  Tari),  Indonesia,  this  suspicion  is  not 
confirmed  by  the  other  checks.  Although  Kupung 
had  larger  hydrostatic  residuals  than  Hailar,  the 
small  size  of  the  other  checks  showed  that  a 
temperature  error  was  very  unlikely.  Note  the  large 
size  of  the  increment  and  spatial  residuals  for 
Hailar;  these  confirm  the  error.  NUAV  would  not 
have  been  able  to  make  a  confident  determination 
in  this  case. 

As  discussed  earlier,  CQCHT  uses  the  baseline 
check,  in  combination  with  others,  to  detect  and 
correct  additional  errors.  Error  types  over  100 
(Table 1) are those detected with the aid of the
baseline  check.  NUAV  can  correct  some  Type  100 
errors  (e.g.,  surface  pressure)  by  switching  digits 
or  adding/subtracting  100  to  obtain  a  better 
pressure. 


The  previous  version  of  the  NMC  QC  algorithm 
(CHQC)  implemented  in  late  1988  used  only  a 
hydrostatic  check  somewhat  similar  to  the  one 
NUAV  uses.  Dr.  William  Collins,  who  works  with 
QC  algorithms  at  NMC,  expressed  the  following 
opinion  about  CHQC  (Collins  and  Gandin,  1990): 
“It  would  hardly  be  possible  to  substantially 
improve  the  CHQC  version  now  in  operational  use 
at  NMC.  Further  progress  may  be  achieved  only 
after  some  other  statistical  checks  have  been 
developed  and  added  to  the  hydrostatic  one.”  This 
goal  was  accomplished  when  CQCHT  became 
operational  in  November  of  1991. 

Because NUAV also lacks statistical checks, its
performance is probably no better than the recently
replaced CHQC. The addition of other checks has
indeed improved the performance of the NMC
algorithm. A study of 15 observation periods in
December 1991 found that CQCHT detected an
average of 26 more errors (81 versus 55) and
confidently corrected twice as many errors (38
versus 24) during each period as CHQC (Morone
et al., 1992). It appears likely that NUAV would
perform no better than CHQC since both lack the
spatial and quantitative increment checks to help
make their determinations.

The  strength  of  CQCHT  lies  in  the  way  the  results 
of  the  various  checks  are  expressed  and  interpreted 
(Gandin and Collins, 1992). The results of each
check  are  expressed  quantitatively  in  the  form  of 
residuals,  rather  than  with  flags  like  NUAV  uses. 
The  DMA  analyzes  the  magnitude  and  pattern  of 
these  residuals  before  making  a  quality  control 
decision.  This  allows  the  DMA  to  determine  the 
origin  of  the  error  in  most  cases  and  to  correct  the 
error  whenever  possible.  CQCHT  produces  a 
printout  of  each  error  with  the  corrections, 
hydrostatic  residuals,  increments,  and  spatial 
residuals.  This  makes  the  confirmation  of  errors 
much  simpler  than  with  NUAV,  which  only 
produces  the  old  and  new  values,  a  validation  flag 


showing  which  check(s)  the  observ^on  failed,  and 
an  enervation  quality  indicator.  In  nearly  5  years 
separating  the  start  dates  of  CQCHT  and  NUAV, 
there  have  clearly  been  a  number  of  advances  in 
quality  controlling  weather  data 

A  final  example  of  the  value  of  CQCHT’s 
additional  checks  is  in  the  area  of  observational 
errors,  which  usually  result  from  faulty 
temperature  sensors.  Since  the  heights  of  the 
mandatory  levels  are  computed  from  the 
temperature  profile  (faulty  in  this  case)  using  the 
hydrostatic  equation,  and  not  from  independent 
height  measurements,  the  hydrostatic  check  will 
not  detect  observational  errors.  The  temperature 
errors  as  well  as  the  resulting  height  errors  will  be 
obvious  upon  examining  the  increment  and  spatial 
check  results.  Although  CQCHT  cannot  correct 
observational errors, it can reject these
observations and prevent faulty data from entering
the database.

Table 3 shows an instance of an observational
error that occurred on 14 July 1992 at Great Falls,
Montana. Only the horizontal residuals are shown
because the increments and vertical residuals show
essentially the same effect. Note the small
magnitude of the hydrostatic temperature and
height  residuals.  The  horizontal  temperature 
residuals  are  large  and  fairly  constant  above  400 
mb.  The  persistent  positive  temperature  error  leads 
to  dramatic  height  errors  as  well;  note  how  the 
height  residuals  steadily  increase  with  height  as 
the  errors  are  compounded  with  each  level.  NUAV 
did  not  find  any  errors  except  at  the  100-mb  level, 
where the temperature exceeded the NUAV gross
error check REJECT limit shown in the last column of
Table 3.

All  the  other  heights  and  temperatures  exceeded 
the NUAV gross error check SUSPECT limits (not
shown),  but  there  was  apparently  not  enough 
supporting  evidence  available  for  these  values  to 
be  corrected  or  rejected. 

The  case  illustrated  in  Table  3  is  not  a  rare  event; 
it  is  fairly  common  and  has  a  strong  effect  on  total 
error  counts.  During  the  months  of  June-December 
(excluding  September)  1992,  an  average  of  41.8 
percent of the errors detected by CQCHT were
observational  errors. 


Table 3. CQCHT detection of observational temperature (T) and height (H) errors. The case shown
is for Great Falls, Montana, on 14 July 1992. Hydrostatic residuals are for the layers above and below
the layer in question.

         Original Values     Hydrostatic Residuals        Horizontal Residuals   NUAV High Reject Limits
Level                        T (°C)        H (m)
(mb)     T (°C)    H (m)     Abv    Blo    Abv    Blo     T (°C)    H (m)        T (°C)    H (m)

500        3.0     5,930     0.8    0.7      8      B      13.4       187         17.0      6,300
400       -3.7     7,720     0.2    0.8      2      8      19.1       297          5.0      8,100
300      -13.5     9,950     0.0    0.2      0      2      24.7       478         -5.0     10,300
250      -19.3    11,320     0.6    0.0      6      0      28.9       620         -9.0
200      -26.6    12,960    -1.0    0.6    -10      6      31.6       833        -13.0
150      -27.8    15,020     0.4   -1.0      B    -10      27.6     1,069        -20.0     15,300
100      -27.1    17,950    -0.3    0.4     -3      B      29.4     1,416        -28.0     18,000

3. METHODOLOGY


3.1 Dealing with Output Differences. In order to
conduct a valid comparison of the algorithms, the
many differences in the output produced by each
must be accounted for, if possible. These are the
major differences:

•  The  CQCHT  data  summary  contains  much 
more  information  about  the  error,  the  correction 
made,  and  why  the  observation  was  considered 
suspect  in  the  first  place. 

•  AFGWC  generally  uses  a  more  liberal  data 
cutoff  time,  thereby  allowing  more  data  to  be 
processed  by  NUAV  than  by  CQCHT.  Part  C  of 
the  sounding  (mandatory  levels  70  mb  and  above) 
is  sent  later  than  part  A  (mandatory  levels 
1,000-100  mb).  Therefore,  on  some  occasions  only 
part  A  makes  it  into  the  CQCHT  database,  while 
NUAV  processes  the  entire  sounding. 

•  The  confidence  placed  in  each  correction  is 
expressed  differently  by  each  of  the  two  methods. 
More  will  be  said  on  this  later. 

3.2  Data  Used.  Summaries  of  monthly  QC  data 
for  July  and  November  of  1992  were  obtained 
from  OL-A,  USAFETAC,  and  NMC.  Only 
mandatory  level  data  (1,000,  850,  700,  500,  400, 
300,  250,  200,  150,  100,  70,  50,  30,  20,  and  10 
mb)  was  used.  Only  height  and  temperature  were 
examined.  AFGWC  and  NMC  both  QC  significant 
level  data,  wind  speed  and  direction,  and  dew 
point  (density),  but  QC  in  these  areas  is  much  less 
advanced.  Since  significant  level  data  does  not 
include  height  information,  the  redundancy  used 
by  the  hydrostatic  equation  to  find  errors  is  not 
available.  Lack  of  a  strong  constraint  like  the 
hydrostatic  equation  severely  limits  QC  of  wind 
and  moisture  data  as  well. 

Early  in  this  project  it  became  clear  that  apparent 
lapses  in  QC  were  due  to  the  fact  that  somewhat 
different  datasets  were  being  processed  by  each 
algorithm. For example, CQCHT did not correct a 500-mb
height from 1,460 to 5,460 meters as NUAV did
simply because that observation did not reach
NMC in time to be in the database. To solve this
problem, only stations processed by both
algorithms on a certain date and time were used.
It was also necessary for both datasets to include
Part C of the sounding if errors were suspected
above 100 mb.
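
The restriction to observations processed by both algorithms amounts to intersecting the two datasets on station and observation time. A minimal sketch follows; the record layout, the station identifiers, and the function name are hypothetical and do not reflect the DATSAV2 or CQCHT summary formats.

```python
def common_observations(nuav_records, cqcht_records):
    """Return the (station, time) keys present in both datasets.

    Each record is assumed to be a (station_id, observation_time) tuple.
    """
    return set(nuav_records) & set(cqcht_records)

# Hypothetical station identifiers and observation times.
nuav = [("72775", "1992-07-14 12Z"), ("67774", "1992-07-23 00Z"),
        ("50527", "1992-07-02 00Z")]
cqcht = [("72775", "1992-07-14 12Z"), ("67774", "1992-07-23 00Z")]

shared = common_observations(nuav, cqcht)
print(sorted(shared))
```

Observations present in only one dataset, such as the third NUAV record here, are left out of the comparison, so a missing detection cannot be mistaken for a QC failure.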

In  addition,  there  were  some  problems  in  the 
DATSAV2  datasets  used  in  this  study.  The  most 
common  problem  was  the  occurrence  of  negative 
height  values  in  the  rejected  data  section  of  the 
output. Negative heights at the 1,000-mb level
occur at low elevation stations, but values between
-10,000 and -70,000 meters are commonly
reported at all mandatory levels. During the
months examined, 45 percent of the rejected height
values were negative, making it difficult to
determine the validity of the correction in many
cases. Height values greater than -1,100 meters
(the NUAV cutoff value) at 1,000 mb were
considered acceptable, but negative values at other
levels were rejected.

3.3  Comparison  Methods.  There  are  several 
ways  to  compare  the  error  detection  capabilities  of 
NUAV  and  CQCHT.  Each  has  strengths  and 
weaknesses,  and  each  helps  highlight  differences 
and  similarities.  In  the  first  method,  a  direct 
comparison  is  made  between  stations  with  errors 
picked  at  random.  Each  case  is  examined  manually 
and  placed  in  a  category.  Each  error  is  either 
detected  by  both  algorithms,  only  detected  by 
NUAV, or only detected by CQCHT. These
categories  are  broken  down  further,  as  shown  here: 

•  Both algorithms detected error:

B1 - CQCHT correction better
B2 - NUAV correction better
B3 - Both corrections good

•  Only CQCHT detected error:

C1 - Correction good
C2 - Correction bad
C3 - Error detected, but correction not possible

•  Only NUAV detected error:

N1 - Correction good
N2 - Correction bad
N3 - Undetermined
N4 - Correction unnecessary

Each  of  these  categories,  and  how  the  proper  one 
is  chosen,  is  discussed  next. 

B1 - Both detected, CQCHT better. The results of
the hydrostatic check and the increment,
horizontal, and vertical residuals make determining
the validity of CQCHT corrections relatively
simple in most cases. If the correction is strongly
supported by the various checks, then it is usually
placed in this category. Further support comes
when the correction itself is simple. Most errors
are due to mistyping a digit, transposing digits,
or a sign error in the temperature, and it takes
only a simple correction to fix one of these errors.
But how is the category
determined if CQCHT makes a correction that is
strongly supported by all the available evidence,
and if NUAV makes a correction very close in
magnitude? Based on raob accuracy studies
(Ahnert,  1991),  2.0  mb  is  a  good  average  value  for 
the  root  mean  square  (rms)  of  the  pressure 
differences  between  various  raob  sensors.  Using  a 
2.0-mb  error  and  height  and  temperature  values 
from  the  standard  atmosphere  in  the  hypsometric 
equation  leads  to  height  differences  of  about  0.5 
percent.  If  the  NUAV  height  value  is  within  0.5 
percent of the corrected (good) CQCHT height
value,  both  corrections  are  considered  good.  The 
value  used  for  temperatures  is  1.0°  C.  Table  4, 
with  data  from  Alta  Floresta,  Brazil,  on  29  July 
1993,  illustrates  a  case  of  a  correction  being 
placed  in  this  category. 
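
The 0.5 percent figure can be checked with a short calculation. The sketch below applies the hypsometric equation to a 2.0-mb pressure difference at 500 mb. The choice of the 500-mb level and the standard-atmosphere values used (about 251.9 K and 5,574 meters) are assumptions made for illustration; the study does not state which levels were used.

```python
import math

R_D = 287.05   # gas constant for dry air, J/(kg K)
G0 = 9.80665   # standard gravity, m/s^2

def height_change_for_pressure_error(p_mb, t_k, dp_mb):
    """Thickness (m) of the thin layer between p_mb and p_mb - dp_mb."""
    return (R_D * t_k / G0) * math.log(p_mb / (p_mb - dp_mb))

# Approximate standard-atmosphere values at 500 mb (assumed here).
STD_T_500_K = 251.9
STD_H_500_M = 5574.0

dz = height_change_for_pressure_error(500.0, STD_T_500_K, 2.0)
percent = 100.0 * dz / STD_H_500_M

print(round(dz, 1), round(percent, 2))
```

Under these assumptions a 2.0-mb difference corresponds to roughly 30 meters at 500 mb, or about 0.5 percent of the height, in line with the tolerance adopted above.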


Table 4. B1-Both algorithms correct error, but CQCHT result better. Example from 29 July 1993,
Alta Floresta, Brazil.


The hydrostatic residuals provide the strongest
support for this correction by CQCHT. The
hydrostatic residual for the layer below (700-500
mb) is +3,032 meters and -2,996 meters for the
layer above (500-400 mb), consistent with the
500-mb height being about 3,000 meters too high.
After the correction is made, these residuals (32
and 3 meters) essentially disappear. The other
checks also suggest a positive error of roughly
3,000 meters. CQCHT chooses the simplest
correction, provided it reduces the residuals the
most. The NUAV correction is certainly better
than keeping the original value, but the CQCHT
data does not support a change of 3,064 meters. In
addition, the NUAV correction is not within
0.5 percent of the strongly supported CQCHT
correction.

In many cases CQCHT does not correct the
observation, but merely flags it as incorrect (see
Category C3). In these cases the statistical
evidence is not strong enough to make a confident
correction, but the CQCHT output does provide
enough information to determine a likely
correction. In this situation, the observation is
usually placed in the “both corrections good”
category, provided the NUAV correction is
supported by the CQCHT output and nearby
soundings.

B2 - Both detected, NUAV better. Unfortunately,
very little evidence is available to support the
NUAV correction over the CQCHT correction.
Without the extensive list of results from the
various checks like CQCHT produces, there is
usually no reason to consider the NUAV result
better. Only a gross error by CQCHT (like
correcting a 500-mb height from 5,560 to 1,560
meters) coupled with a reasonable NUAV
correction would cause the NUAV result to be
declared better. Such cases may occur, but they
are very infrequent.

B3 - Both detected, both corrections good. As
explained earlier, if the CQCHT correction is
considered good and the NUAV value is within
0.5 percent of the corrected height or 1° C of the
corrected temperature value, both corrections are
considered good. The CQCHT value is used as the
basis for determining whether the correction is
good or not good simply because so little
information is provided with the NUAV
corrections. An example of this type of correction
is shown in Table 5 (Harare, Zimbabwe, 23 July
1992). All the evidence suggests a positive error of
about 2,000 meters. The CQCHT correction is
further supported by the simple one-digit change.
The correction by NUAV is certainly not simple,
but since it only differs from the CQCHT value by
0.3 percent, it is also considered a good correction.
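
The agreement test used for this category can be written as a short sketch. The function name and argument layout are hypothetical; the tolerances (0.5 percent for height, 1.0° C for temperature) and the Harare heights are taken from the text and Table 5.

```python
def corrections_agree(nuav_value, cqcht_value, kind):
    """True when the NUAV correction falls within tolerance of the
    CQCHT correction, which is used as the reference value."""
    difference = abs(nuav_value - cqcht_value)
    if kind == "height":
        return difference <= 0.005 * abs(cqcht_value)   # 0.5 percent
    if kind == "temperature":
        return difference <= 1.0                        # 1.0 degree C
    raise ValueError("kind must be 'height' or 'temperature'")

# Harare, Zimbabwe, 23 July 1992: corrected 200-mb heights (m).
nuav_h, cqcht_h = 12403.0, 12440.0
percent_diff = 100.0 * abs(nuav_h - cqcht_h) / cqcht_h

print(round(percent_diff, 1), corrections_agree(nuav_h, cqcht_h, "height"))
```

The two corrected heights differ by 37 meters, about 0.3 percent, so the case falls inside the tolerance and both corrections are counted as good.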


Table 5. B3-Both algorithms make good corrections. Case from 23 July 1992, Harare, Zimbabwe.

                                              Hydrostatic Residuals (m)
         Level    Old H     New H    Change   Before           After         Increment   Residuals (m)
Method   (mb)     (m)       (m)      (m)      Abv      Blo     Abv    Blo    (m)         Horiz    Vert

NUAV     200     14,440    12,403   -2,037
CQCHT    200     14,440    12,440   -2,000   -1,999    1,994     0     -5    2,017       2,007    2,000

C1 - CQCHT only, correction good. The techniques
already discussed are also used to place corrections
in this category.

C2 - CQCHT only, correction bad. There is rarely
any evidence to suggest that the correction is bad.
If the evidence suggesting the presence of an error
is not very strong, a correction is not made and the
observation is flagged for further analysis (see C3
below). Because only cases with the strongest
supporting evidence are corrected, the chances of
a poor correction being found are very low.

C3 - CQCHT only, error detected, but not
corrected. Many (35 percent is typical) of the
suspected errors are not correctable. These
observations are put into one of two error groups.
Error type 3 observations are probably bad and are
passed to a specialist who either rejects or retains
the value. Error type 4 observations are definitely
bad and are automatically rejected. The decisions
of the NMC specialist are not included in the
CQCHT output.

N1—NUAV only, correction good. Errors placed in
this  category  are  essentially  CQCHT  “misses.” 
Little  evidence  is  available  to  help  determine 
whether  the  correction  is  good  or  not.  Nearby 
soundings  are  examined  for  any  large  disparities 
with  the  suspect  value.  Reasonable  changes  in 
gross  errors  are  generally  considered  to  be  good 
corrections.

Table 6. N2-Poor corrections made by NUAV. Case from Frobisher, Canada, 11 July 1992.

N2—NUAV only, correction bad. The same factors
are  used  as  in  determining  if  the  correction  is 
good.  Sometimes  an  error  may  have  been  detected 
by  CQCHT  at  the  level  above  or  below  the  level 
NUAV suspects. If the CQCHT correction appears
to be good, this fact may rule out the NUAV
correction. Table 6 shows an example of this from
Frobisher, Canada, on 11 July 1992. The key level
to look at here is 200 mb, where CQCHT has
made a minus 70-meter correction in the height.
This is a Type 6 error (see Table 1), that is, a
computational error in thickness. The hydrostatic
residual for the layer below (250-200 mb) is plus
78 meters, while the layer above (200-150 mb) has
essentially no residual.

The  other  CQCHT  checks  also  suggest  that  the 
200-mb  height  is  too  large.  Note,  for  example,  the 
relatively  large,  positive  increments  and  horizontal 
residuals  at  200, 150,  and  100  mb.  A  large  vertical 
residual  (50  meters)  occurs  at  200  mb  because  the 
error  is  probably  between  the  levels  used  to 
compute  the  residuals  (250  and  150  mb).  In 
contrast,  the  vertical  residuals  at  150  and  100  mb 
are  probably  small  because  errors  of  identical 
magnitude  occur  at  the  neighboring  levels.  This  is 
strong  evidence  that  the  200-mb  level  is  too  high, 
not  that  the  250-mb  level  is  too  low,  as  NUAV 
has  found.  Note  how  the  70-meter  computational 
error  at  200  mb  also  affects  every  layer  above  it. 
These  layers  are  in  hydrostatic  balance  (extremely 
small  hydrostatic  residuals);  they  probably  looked 
fine to NUAV, but the increments and spatial
checks provided CQCHT with enough additional
evidence to detect the error.
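
The behavior described for the Frobisher case, a residual in one layer only with every layer above it still in balance, can be reproduced with a small Python sketch of a hydrostatic residual. This is a simplified illustration, not the NUAV or CQCHT code: it assumes dry air, uses the plain average of the two bounding temperatures as the layer mean, and runs on a made-up sounding. All names and sample values are ours.

```python
import math

R_DRY = 287.05   # gas constant for dry air, J kg^-1 K^-1
G0 = 9.80665     # standard gravity, m s^-2

def hypsometric_thickness(p_bottom_mb, p_top_mb, t_bottom_c, t_top_c):
    """Layer thickness (m) from the hypsometric equation, using the simple
    average of the two bounding temperatures as the layer-mean temperature."""
    t_mean_k = 0.5 * (t_bottom_c + t_top_c) + 273.15
    return R_DRY * t_mean_k / G0 * math.log(p_bottom_mb / p_top_mb)

def hydrostatic_residuals(levels):
    """Reported thickness minus hypsometric thickness for each adjacent pair.
    `levels` is a list of (pressure_mb, height_m, temp_c), ordered bottom-up."""
    residuals = []
    for (p0, z0, t0), (p1, z1, t1) in zip(levels, levels[1:]):
        residuals.append((z1 - z0) - hypsometric_thickness(p0, p1, t0, t1))
    return residuals

# Build a hydrostatically consistent (made-up) sounding from 250 to 100 mb.
pressures = [250.0, 200.0, 150.0, 100.0]
temps_c = [-45.0, -55.0, -60.0, -65.0]
heights = [10500.0]
for i in range(1, len(pressures)):
    heights.append(heights[-1] + hypsometric_thickness(
        pressures[i - 1], pressures[i], temps_c[i - 1], temps_c[i]))

# Simulate a 70-m computational error in the 250-200 mb thickness:
# every height from 200 mb upward is shifted by the same amount.
bad_heights = [heights[0]] + [z + 70.0 for z in heights[1:]]

levels = list(zip(pressures, bad_heights, temps_c))
for (p0, p1), r in zip(zip(pressures, pressures[1:]), hydrostatic_residuals(levels)):
    print(f"{p0:.0f}-{p1:.0f} mb residual: {r:+.1f} m")
```

Only the 250-200 mb layer shows a residual of about plus 70 meters; the layers above remain in balance because their bounding heights are shifted together. A hydrostatic check alone therefore cannot tell whether the 200-mb level is too high or the 250-mb level is too low, which is where the increment and spatial checks add information.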

N3—NUAV only, undetermined result. In most of
the cases, there is no evidence either for or against
the NUAV correction. The most common cause
for this lack of evidence is missing or incorrect
original data values. As mentioned earlier,
negative rejected height values are a common
problem. Occasionally, more than one original
value is listed, also making the error difficult to
evaluate. In certain cases, NUAV generates
mandatory level data for missing levels. CQCHT
never “creates” missing data in this manner. This
is a difficult situation to classify since the creation
of data is not really a QC function. Because this
situation is relatively uncommon and the
“corrections” generally do not result in large
errors, these cases are considered undetermined.

N4—NUAV only, unnecessary correction. Some of
the NUAV corrections are unusually small; a 1° C
temperature correction or a 5-meter height change
is hard to justify. These corrections would
probably cause little or no harm in the operational
database, but there is no reason for making them.
Height corrections of 20 meters or less and
temperature corrections of 2° C or less are placed
in this category.
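
The N4 size thresholds are the one categorization rule in this section that is fully objective, so they are easy to express in code. The following Python sketch is an illustration under that reading; the function name, the variable labels, and the first two sample values are hypothetical.

```python
HEIGHT_LIMIT_M = 20.0   # height corrections of 20 m or less
TEMP_LIMIT_C = 2.0      # temperature corrections of 2 deg C or less

def is_unnecessary(variable, old_value, new_value):
    """Return True when a correction falls under the N4 size thresholds."""
    change = abs(new_value - old_value)
    if variable == "height":
        return change <= HEIGHT_LIMIT_M
    if variable == "temperature":
        return change <= TEMP_LIMIT_C
    raise ValueError(f"unknown variable: {variable!r}")

print(is_unnecessary("height", 5575.0, 5580.0))      # hypothetical 5-m change
print(is_unnecessary("temperature", -21.3, -22.3))   # hypothetical 1 deg C change
print(is_unnecessary("height", 14440.0, 12403.0))    # NUAV change from Table 5
```

The first two calls fall under the thresholds and the third does not, so only a gross change such as the Harare correction would go on to be judged as good or bad.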

3.4 Other Comparison Methods. A less
direct method of comparison is to simply count the
total number of errors detected by each algorithm.
As discussed earlier, some of the differences in
error counts may be due to the fact that slightly
different data is processed by each algorithm.
Another difficulty is that NUAV does not declare
some observations to be in error but not
correctable, as CQCHT does. In addition, negative
rejected height values in the NUAV data make
some observations difficult to categorize.

A final way to compare algorithms is to count the
number of stations with errors. It is important to
make the distinction between the number of errors
and the number of stations with errors. Since a
particular station may have errors at several
different levels, the number of errors is greater
than the number of stations with errors. Since it’s
difficult to determine how best to summarize
errors, statistics on both counting methods are
presented. The occurrence of a computational error
in layer thickness provides a good example of how
errors can be counted differently. A single error
in computing the 850-700 mb thickness leads
to identical height errors at every level above 850
mb. Cases of 10 height corrections due to a single
computational error at a station are not
uncommon.


4.  RESULTS


4.1 Direct Comparison. A dataset containing
only stations/dates in which both algorithms
detected errors was created (see Section 3.2). From
this dataset, 25 stations from each month were
picked at random and the results from NUAV and
CQCHT were compared manually. Each error was
evaluated and placed in one of the categories
discussed in Section 3.3. Since each station may
have had more than one error on a particular date
and time, the total number of errors was greater
than 50. Table 7 shows the results of this
comparison for the months of July and November
1992. An average of only 25.6 percent of the
errors was detected by both algorithms. Of these
errors, 80.5 percent were corrected equally well
by each algorithm. Most of
the errors in this group were very large and
resulted from switched temperature signs and
mistyped height values.


Table 7. Direct comparison of errors at 50 randomly selected stations.

Category                          July   November   Average   Percent of Total
Both picked, CQCHT better (B1)       3          4       3.5                5.0
Both picked, NUAV better (B2)        0          0       0                  0.0
Both picked, both good (B3)         16         13      14.5               20.6
CQCHT only, good corr. (C1)         12         24      18.0               25.5
CQCHT only, bad corr. (C2)           0          0       0                  0.0
CQCHT only, no corr. (C3)           19         14      16.5               23.4
NUAV only, good corr. (N1)           4          9       6.5                9.2
NUAV only, bad corr. (N2)            2          1       1.5                2.1
NUAV only, undetermined (N3)         2          9       5.5                7.8
NUAV only, unnecessary (N4)          3          6       4.5                6.4
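
The Average and Percent of Total columns of Table 7 follow directly from the July and November counts. The Python sketch below recomputes them as a cross-check; it is an illustration only, and the dictionary layout and function names are ours.

```python
# Error counts from Table 7: category -> (July, November).
counts = {
    "B1": (3, 4), "B2": (0, 0), "B3": (16, 13),
    "C1": (12, 24), "C2": (0, 0), "C3": (19, 14),
    "N1": (4, 9), "N2": (2, 1), "N3": (2, 9), "N4": (3, 6),
}

# All errors in both months combined.
grand_total = sum(july + nov for july, nov in counts.values())

def average(category):
    """Mean of the July and November counts for one category."""
    july, nov = counts[category]
    return (july + nov) / 2

def percent_of_total(category):
    """Share of all errors in this category, rounded to one decimal."""
    return round(100.0 * sum(counts[category]) / grand_total, 1)

print(f"total errors in both months: {grand_total}")
for category in counts:
    print(f"{category}: average {average(category):4.1f}, "
          f"{percent_of_total(category):4.1f} percent of total")
```

The two months contain 141 errors in all. For C3, the counts of 19 and 14 give an average of 16.5, which matches the 23.4 percent listed for that category. The 25.6 percent quoted for errors detected by both algorithms is the sum of the rounded B1 and B3 entries (5.0 and 20.6).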

Cases  in  which  both  algorithms  detected  an  error 
at  the  same  date,  time,  station,  and  level  can  be 
compared  to  assure  that  identical  data  was 
processed.  For  this  reason,  errors  belonging  in  the 
“both”  group  are  studied  in  more  detail.  A  random 
sample  of  100  of  these  cases  was  chosen  from 
each  month  and  placed  in  the  categories  shown  in 
Table 8. The same techniques
discussed previously are used to categorize the
data. As in Table 7, not enough information is
available with the NUAV errors to place errors in
a “NUAV better” category. If the corrections made
are different and the various CQCHT checks
provide strong support, it is assumed that the
CQCHT correction is better.

Table 8. Direct comparison of errors detected by both algorithms.

Category                     July   November   Total
Temperature, both good         44         54    98 (49.0%)
Temperature, CQCHT better      20         16    36 (18.0%)
Total temperature errors       64         70   134 (67.0%)
Height, both good              21         12    33 (16.5%)
Height, CQCHT better           15         18    33 (16.5%)
Total height errors            36         30    66 (33.0%)
All errors, both good          65         66   131 (65.5%)
All errors, CQCHT better       35         34    69 (34.5%)

Of the 200 errors examined, 65.5 percent were
well corrected by both algorithms. Most of these
errors were quite large and easily detected by the
hydrostatic check employed by both algorithms.
The remaining 34.5 percent of the errors were
corrected more accurately by CQCHT, but the
difference was generally not very large. These
percentages suggest that the hydrostatic check used
by NUAV generally performs as intended.

The additional statistical checks performed by
CQCHT allow it to fine-tune its corrections to a
higher level of accuracy. Temperature errors
accounted for 67 percent of those examined; 98 of
134 (73.1%) were corrected equally well by both
algorithms. In comparison, only 50 percent of the
height errors were corrected equally well by both.
This is probably because most of the temperature
errors are fixed simply by switching the sign,
while height errors are more complex. Another
cause of the difference is the criteria used to
determine when NUAV and CQCHT height and
temperature values are essentially equal. This
comparison completely ignores the errors detected
by one algorithm because of its inherent strengths
and missed by the other because of certain
weaknesses.


Although simply switching the sign of the
temperature corrects many errors, this is not
always the answer. The corrections made by
NUAV on the 30 July/00Z data from Karachi,
Pakistan (shown in Table 9), illustrate the
problems that can result if this approach is not
used carefully. For unknown reasons, NUAV
reverses the signs of the temperatures at 850 and
700 mb, leading to improbably cold readings for
Karachi (24° 54’N, elevation 24 meters) in July.
Table 9 provides several other examples.
Srinagar, India (681 NM from Karachi), reported
a 700-mb temperature of plus 13.6° C, providing
support for not changing the original temperature.
Not only are the corrected temperatures too low in
this case, but the resulting vertical temperature
profile is very unlikely. Although the improper
sign on a temperature is a common error detected
frequently by both algorithms, NUAV seems to
make improper corrections fairly often. The large
errors that result would certainly cause the
sounding to be out of hydrostatic balance, but
apparently the NUAV hydrostatic check still
cannot detect the error. This type of error could
not be detected in the CQCHT output. It’s
important to note that most of these errors are
much smaller, usually less than 10° C.

Table 9. Erroneous NUAV temperature corrections.

                                                Level   Old T    New T (°C)
Station                            Date         (mb)    (°C)     NUAV        CQCHT
Karachi, Pakistan                  30 Jul/00Z   850     +25.4    -25.4       No Change
                                                700     +13.0    -13.0       No Change
Bangalore, India                   14 Jul/00Z   500     -51.5    +51.5       -1.5
Makung, Taiwan                     22 Jul/12Z   300     -32.1    No Change   No Change
                                                250     -41.5    No Change   No Change
                                                200     +31.0    -31.0       -51.5
                                                150     -63.9    No Change   No Change
Mersa Matruh, Equatorial Guinea    25 Jul/12Z   500     +30.0    -30.0       (illegible)
                                                400     -13.1    No Change   No Change
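
The Makung case in Table 9 shows why a bare sign reversal can still leave an implausible profile. As a rough illustration, the Python sketch below interpolates the 200-mb temperature linearly in the logarithm of pressure from the unchanged 250-mb and 150-mb values and compares each candidate with it. This is not the actual CQCHT vertical check, which is statistical; the function and variable names are ours.

```python
import math

def interpolate_log_p(p_target, p_below, t_below, p_above, t_above):
    """Temperature at p_target, interpolated linearly in ln(pressure)
    between the level below (higher pressure) and the level above."""
    frac = math.log(p_below / p_target) / math.log(p_below / p_above)
    return t_below + frac * (t_above - t_below)

# Makung, Taiwan, 22 Jul/12Z (Table 9): neighbours of the 200-mb level.
expected_200 = interpolate_log_p(200.0, 250.0, -41.5, 150.0, -63.9)

candidates = {
    "reported": 31.0,
    "NUAV (sign reversed)": -31.0,
    "CQCHT": -51.5,
}
departures = {label: abs(t - expected_200) for label, t in candidates.items()}

print(f"interpolated 200-mb temperature: {expected_200:.1f} C")
for label, temp_c in candidates.items():
    print(f"{label:22s} {temp_c:+6.1f} C  departs by {departures[label]:5.1f} C")
```

The interpolated value is close to minus 51 °C. The CQCHT value lies within a few tenths of a degree of it, while the sign-reversed NUAV value is still about 20 °C too warm and would be warmer than the 250-mb level beneath it.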

4.2 Total Error Count. A less direct method of
comparison is to simply count the total number of
errors (not the number of stations with errors)
detected by both algorithms. As discussed in
Section 3.3, differences in the way errors are
classified lead to difficulties in comparing the
results. The negative heights encountered in the
data rejected by NUAV are probably the biggest
problem. These errors were not checked manually,
so there is certainly the possibility that a few of
the corrections are bad.


Table 10 shows the CQCHT results for the 2
months studied. The total number of errors in 120
time periods (00Z and 12Z; 62 in July and 58 in
November) and the average number of errors in
each period are grouped by height, temperature,
and pressure. Tables A-1 and A-3 in the Appendix
show this information for July and November
separately. Most errors occur in height values
(about 85 each period), with a relatively small
number of pressure errors (about 6 each period).
An average of 80.6 corrections and 62.7 error
detections without correction are made each
period by CQCHT.

Table 10. CQCHT error summary (July and November 1992).

                           Number of errors                   Average each period
                           Height   Temp    Press   All       Height   Temp   Press   All
Corrections                5,662    3,475   530     9,667     47.2     29.0   4.4     80.6
Data bad, no corrections   4,559    2,800   170     7,529     38.0     23.3   1.4     62.7
Total errors               10,221   6,275   700     17,196    85.2     52.3   5.8     143.3
Suspect, but OK            1,854    1,449   --      3,303     15.5     12.0   --      27.5
Table 11. NUAV error summary (July and November 1992).

                     Number of errors                  Average each period
                     Height   Temp    Press   All      Height   Temp   Press   All
Corrections          1,982    4,738   128     6,848    16.4     39.2   1.1     56.6
No Change            226      3,260   0       3,486    1.9      26.9   0       28.8
Height negative      2,107    --      --      2,107    17.4     --     --      17.4
Old value missing    9        0       0       9        0.1      0      0       0.1

Table 11 shows the total number of errors
corrected by NUAV during November (59 periods)
and July (62 periods) 1992. Also shown are cases
with negative height values (or less than -1,100
meters at 1,000 mb) and cases for which there was
no change. Corrections of less than 2.0° C or 10
meters are categorized as “no change.” Tables A-2
and A-4 in the Appendix show this information for
July and November separately. The total number
of corrections made by NUAV averages 24.0
fewer than CQCHT each period. Although
CQCHT corrects many more height errors (47.2
versus 16.4) and pressure errors (4.4 versus 1.1)
each period, NUAV leads in temperature
corrections with 10.1 more corrections a period.

The counts in Table 11 must be viewed with
caution because the 6,848 corrections were not
each checked manually, as was done in Section
4.1. Looking back at Table 7, we see that 52
errors were corrected by NUAV, not including
undetermined and unnecessary corrections. Of this
total, seven (13.5%) were corrected better by
CQCHT and three (5.8%) were bad corrections.
These calculations suggest that about 10 percent of
the corrections in Table 11 could be of poor
quality or wrong.

As  mentioned  previously,  there  is  a  possibility  that 
many  of  the  rejected  negative  heights  (i.e.,  those 
declared  to  be  in  error)  did  not  actually  indicate  a 
poor  correction.  Perhaps  a  good  correction  was 
made,  but  the  faulty  original  value  kept  this  fact 
from  being  discovered.  A  sample  of  50  cases  with 
rejected  negative  heights  was  compared  to  the 
original CQCHT values to see if NUAV actually
made a correction or not. In this sample, 40
percent of the heights were corrected; in other
words, the new value generated by NUAV was
different from the original CQCHT value in 40
percent of the cases. In the remaining 60
percent, the new NUAV value was identical to the
original CQCHT value, indicating that no change
was made. Assuming these percentages are
representative of all 2,107 values, the number of
height corrections would be increased by 843 to
2,825, or 23.3 per period. The average number of
total corrections per period would increase to 63.6.
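
The adjustment for rejected negative heights is simple arithmetic on the Table 11 totals. The Python sketch below reproduces it step by step; the variable names are ours, and the 40 percent share is the sample estimate quoted above, not a measured value for all 2,107 cases.

```python
negative_height_cases = 2107   # "Height negative" row, Table 11
corrected_fraction = 0.40      # share of the 50-case sample that NUAV changed
height_corrections = 1982      # NUAV height corrections, Table 11
all_corrections = 6848         # NUAV corrections of all types, Table 11
periods = 121                  # 62 periods in July plus 59 in November

# Estimated number of negative-height cases that were real corrections.
extra = round(negative_height_cases * corrected_fraction)
adjusted_height = height_corrections + extra
adjusted_total = all_corrections + extra

print(f"additional height corrections: {extra}")
print(f"adjusted height corrections: {adjusted_height} "
      f"({adjusted_height / periods:.1f} per period)")
print(f"adjusted total corrections per period: {adjusted_total / periods:.1f}")
```

Even with this adjustment, the NUAV average remains well below the 80.6 corrections per period made by CQCHT.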


Table 12. Number of stations with errors (July and November 1992).

                                 Stations Checked    Errors Detected    Error Percentage
Region (block #s)                CQCHT     NUAV      CQCHT    NUAV      CQCHT    NUAV
Europe (01-17)                   10,930    10,340    407      832       3.72     8.05
Former USSR (20-38)              17,778    17,631    1,501    1,488     8.44     8.44
Asia (40-41, 44-48)              7,643     7,616     871      637       11.40    8.36
India (42 and 43)                3,256     3,384     1,400    201       43.00    5.94
China (50-59)                    14,546    14,543    1,440    19        9.90     0.13
Africa (60-68)                   3,856     3,891     625      325       16.21    8.35
N. America (70-74)               14,566    14,439    285      369       1.96     2.56
Cen. America (76, 78)            2,039     2,069     201      212       9.86     10.25
S. America, Antarctica (80-89)   3,005     3,151     532      364       17.70    11.55
Australia and Pacific (91-98)    6,595     6,134     626      416       8.07     6.78
TOTALS                           84,214    83,198    7,888    4,863     9.37     5.85

4.3  Stations  with  Errors.  The  last  method  used  to 
compare  NUAV  and  CQCHT  is  to  compare  the 
number  of  stations  with  errors.  A  RAOB  with 
height  errors  at  every  level  from  850  to  100  mb 
and  a  few  temperature  errors  as  well,  is  only 
counted  as  one  error,  rather  than  nine.  Table  12 
shows  the  number  of  stations  with  errors  in  the 
months  of  July  and  November  combined.  Also 
shown  are  the  number  of  stations  checked  and  the 
percentage of stations with errors (Tables A-5 and
A-6 in the Appendix show the same information
separately for July and November). The number of
stations  checked  is  provided  in  the  CQCHT 
summary  generated  by  NMC  every  month.  The 
numbers  used  for  NUAV  are  actually  the  number 
of  complete  soundings  (up  to  100  mb)  going  into 
the  DATSAV2  database.  More  complete  data,  such 
as  the  number  of  part  A  and  C  sections  checked, 
is  not  available.  For  these  reasons,  consider  the 
number  of  stations  checked  by  NUAV  and  shown 
in  Table  12  approximate.  In  most  cases,  however, 
the values for NUAV and CQCHT agree quite closely.
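
The difference between counting errors and counting stations with errors can be made concrete with a short Python sketch. The records below are hypothetical: one sounding carries the height errors that a single 850-700 mb thickness mistake would produce at every level above 850 mb, plus one temperature error, and a second sounding carries one height error. Station identifiers and dates are made up.

```python
from collections import Counter

# Hypothetical flagged items: (station_id, date_time, level_mb, variable).
flags = [
    ("STN_A", "1992-07-11 12Z", level, "height")
    for level in (700, 500, 400, 300, 250, 200, 150, 100)
] + [
    ("STN_A", "1992-07-11 12Z", 500, "temperature"),
    ("STN_B", "1992-07-11 12Z", 200, "height"),
]

# Method 1: every flagged value counts as one error.
total_errors = len(flags)

# Method 2: a sounding counts once, however many of its values are flagged.
soundings_with_errors = {(station, when) for station, when, _, _ in flags}
errors_per_sounding = Counter((station, when) for station, when, _, _ in flags)

print(f"total errors: {total_errors}")
print(f"soundings with errors: {len(soundings_with_errors)}")
for key, n in errors_per_sounding.items():
    print(key, n)
```

Ten flagged values collapse to two soundings with errors, which is why the totals in Section 4.2 and the counts in Table 12 are on such different scales.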




The  most  dramatic  difference  between  the  two 
algorithms  occurs  in  the  China  data.  Although  the 
number  of  Chinese  stations  checked  is  nearly 
identical,  NUAV  finds  only  a  fraction  of  the  errors 
detected  by  CQCHT.  More  observational  or 
measurement  errors  occur  in  data  from  Third 
World  nations  due  to  equipment  problems;  this 
type  of  error  is  missed  by  the  hydrostatic  check. 
CQCHT  is  able  to  detect  more  of  these  errors  with 
its  additional  checks.  This  fact  may  explain  a 
portion  of  the  difference,  but  hydrostatic  errors  are 
still  the  most  common  in  China;  more  than  19  of 
these almost certainly occurred. It’s hard to
believe  that  North  America  had  an  error 
percentage  rate  20  times  higher  than  China, 
especially  in  light  of  the  relative  agreement  among 
the  algorithms  in  Asia,  the  former  USSR,  North 
America,  Central  America,  and  Australia.  Possibly 
some  aspect  of  NUAV  prevents  all  the  data  from 
being  checked,  although  there  is  no  direct  evidence 
for  this.  China  produces  relatively  reliable 
upper-air  data  compared  to  India  where,  according 
to  CQCHT,  an  astounding  43  percent  of  the 
soundings  have  errors. 
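
The "20 times" comparison can be reproduced from the station counts in Table 12. The Python sketch below does so for North America and China; it is only a cross-check of the published figures, and the data layout and function name are ours.

```python
# (stations checked, stations with errors), copied from Table 12.
table_12 = {
    "North America": {"CQCHT": (14566, 285), "NUAV": (14439, 369)},
    "China": {"CQCHT": (14546, 1440), "NUAV": (14543, 19)},
}

def error_rate(checked, with_errors):
    """Percentage of checked soundings that had at least one error."""
    return 100.0 * with_errors / checked

rates = {
    region: {alg: error_rate(*pair) for alg, pair in by_alg.items()}
    for region, by_alg in table_12.items()
}
nuav_ratio = rates["North America"]["NUAV"] / rates["China"]["NUAV"]

for region, by_alg in rates.items():
    print(f"{region:14s} CQCHT {by_alg['CQCHT']:5.2f}%  NUAV {by_alg['NUAV']:5.2f}%")
print(f"NUAV rate, North America / China: {nuav_ratio:.1f}")
```

By the NUAV counts, the North American rate is just under 20 times the Chinese rate, while the CQCHT counts put China about five times higher than North America.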


CQCHT  results  have  found  that  most  regions  of 
the  world  have  fewer  observational  errors  than 
errors  detected  by  the  hydrostatic  check  (Morone 
et  al.,  1992).  The  exception  is  India,  where  more 
than  twice  as  many  errors  are  observational.  This 
fact  may  help  explain  the  huge  difference  in  errors 
detected  in  India  by  CQCHT  and  NUAV,  but  the 
same  problem  that  affects  the  data  from  China 
may  play  a  role.  It  certainly  seems  unlikely  that 
Europe,  with  its  mostly  automated  RAOB  network, 
has  a  significantly  higher  error  rate  than  India’s. 

It  is  only  among  the  high-quality  soundings  of 
North  America  and  Europe  that  NUAV  detects  a 
much  higher  percentage  of  errors  than  CQCHT.  As 
has  been  shown,  the  poor  corrections  and 
unnecessary  corrections  in  the  NUAV  data 
probably  raise  the  error  numbers.  Although  some 
inaccuracies  in  the  counts  of  stations  processed 
surely  exist,  it’s  probable  that  the  higher 
worldwide  percentages  obtained  by  CQCHT 
actually  reflect  the  presence  of  additional  statistical 
checks  in  the  NMC  algorithm. 


5.  CONCLUSIONS


This study shows NMC’s CQCHT to be a better
QC algorithm than AFGWC’s NUAV.

NUAV,  which  became  operational  on  22 
December  1986,  relies  primarily  on  the  hydrostatic 
check  to  detect  errors.  The  hydrostatic  equation 
provides  a  very  powerful  constraint  on  heights  and 
temperatures,  but  it  cannot  be  used  to  detect 
observational  errors — those  that  occur  before  the 
data  is  processed  at  the  reporting  station.  If  a 
broken  sensor  gives  temperature  readings  that  are 
off  by  a  few  degrees,  these  readings  will  be  used 
in  the  hydrostatic  equation  to  compute  the 
mandatory  heights,  thereby  giving  incorrect  height 
values.  When  this  data  is  quality-controlled,  the 
hydrostatic  check  will  not  find  an  error,  and  the 
errors  may  be  too  small  to  be  detected  by 
NUAV’s  gross  error  check. 

In  contrast,  CQCHT  uses  quantitative  increment, 
horizontal,  and  vertical  checks  to  detect  errors. 
CQCHT  became  operational  in  November  1991 
and  employs  the  latest  techniques  in  automated 
QC.  NMC  produces  monthly  summaries  of  QC 
results  that  are  continually  monitored  for  any 
CQCHT  problems  or  chronic  data  problems  at  any 
one  station.  NUAV  essentially  employs  the  same 
techniques  used  by  the  previous  generation  of 
NMC  QC  algorithms  (CHQC  in  1989). 

A  direct  comparison  of  50  randomly  selected 
stations  found  CQCHT  alone  detected  48.9  percent 
of  the  total  errors.  Both  algorithms  detected  25.6 
percent  of  the  errors,  while  NUAV  alone  detected 
25.5  percent.  Of  the  errors  found  only  by  NUAV, 
33.3  percent  were  either  bad  or  unnecessary.  When 
both  algorithms  corrected  the  same  observation, 
the  CQCHT  correction  was  better  19.4  percent  of 
the  time,  with  comparable  corrections  being  made 
on  the  rest  of  the  observations.  Although  the 
categories  used  in  this  section  were  determined 
subjectively,  and  the  lack  of  available  NUAV  data 
made categorizing NUAV corrections difficult, the
amount of quantitative evidence provided with
each CQCHT correction made confident
categorizations possible.

A comparison of the total number of errors
corrected by each algorithm found that CQCHT
made an average of 24 more corrections each
period than NUAV (2,819 more corrections)
during the 2 months studied. There were
uncertainties in this comparison, however, because
the large number of rejected height values that are
negative makes these cases difficult to classify.
Using estimates suggesting that about 40 percent
of these cases are actually corrections still leaves
NUAV 1,976 corrections short of CQCHT’s
performance over a 2-month period. In addition to
corrections, CQCHT also detected 7,529
uncorrectable errors during the study period. These
errors are either rejected or assimilated with
reduced weight.

NUAV  detected  3,025  fewer  stations  with  errors 
during  the  study  period.  It  should  be  noted  that  the 
error  counts  made  by  NUAV  for  China  and  India 
are  dramatically  lower  than  those  made  by 
CQCHT.  It  is  possible  that  a  NUAV  problem 
peculiar  to  stations  in  China  and  India  leads  to  the 
large  difference  in  these  areas. 

While  the  evidence  supporting  CQCHT  as  the 
more  advanced  QC  algorithm  is  strong,  this  study 
does  not  address  other  factors  which  must  also  be 
considered  before  deciding  to  receive  QC  data 
from  NMC.  For  example,  will  differing  data  cutoff 
times  allow  all  the  data  of  interest  to  AFGWC  to 
get  into  the  database?  The  costs  of  updating 
NUAV,  if  that  option  were  pursued,  may  be  much 
greater  than  those  associated  with  receiving 
CQCHT  data.  The  degree  of  monitoring  performed 
on  each  algorithm  is  another  important 
consideration.  Further  discussions  with  scientists  at 
NMC  are  required  before  these  issues  can  be 
resolved completely.


APPENDIX


Comparison Tables (CQCHT versus NUAV)
July and November

Table A-1. CQCHT error summary for July 1992.

                          Number of errors                  Average each period
                          Height   Temp    Press   All      Height   Temp   Press   All
Corrections               2,817    1,538   264     4,620    46.4     24.8   4.3     74.5
Data bad, no correction   2,118    1,241   80      3,438    34.2     20.0   1.3     55.5
Total errors              4,935    2,780   344     8,059    79.6     44.8   5.5     130.0
Suspect, but OK           836      695     --      1,531    13.5     11.2   --      24.7

Table A-2. NUAV error summary for July 1992.

                     Number of errors                  Average each period
                     Height   Temp    Press   All      Height   Temp   Press   All
Corrections          1,015    2,259   97      3,371    16.4     36.4   1.6     54.4
No Change            113      1,585   0       1,698    1.8      25.6   0       27.4
Height negative      1,045    --      --      1,045    16.9     --     --      16.9
Old value missing    5        0       0       5        0.1      0      0       0.1

Table A-3. CQCHT error summary for November 1992.

                     Number of errors                  Average each period
                     Height   Temp    Press   All      Height   Temp   Press   All
Corrections          2,845    1,936   266     5,047    49.1     33.4   4.6     87.0
Data bad, no corr.   2,441    1,559   90      4,090    42.1     26.9   1.6     70.5
Total errors         5,286    3,495   356     9,137    91.1     60.3   6.1     157.5
Suspect, but OK      1,018    754     --      1,772    17.6     13.0   --      30.6

Table A-4. NUAV error summary for November 1992.

                     Number of errors                  Average each period
                     Height   Temp    Press   All      Height   Temp   Press   All
Corrections          967      2,479   31      3,477    16.4     42.0   0.5     58.9
No Change            113      1,675   0       1,788    1.9      28.4   0       30.3
Height negative      1,062    --      --      1,062    18.0     --     --      18.0
Old value missing    4        0       0       4        0.1      0      0       0.1

Table A-5. Number of stations with errors in July. Figures are given in total stations and percentage
of stations with errors in the regions shown.

                                     Stations            Errors             Error Percentage
Region (block #s)                    CQCHT     NUAV      CQCHT    NUAV      CQCHT    NUAV
Europe (01-17)                       5,583     5,440     193      400       3.46     7.35
Former USSR (20-38)                  8,626     8,807     663      784       7.69     8.90
Asia (40-41, 44-48)                  3,775     3,819     428      342       11.34    8.96
India (42 and 43)                    1,638     1,727     745      125       45.48    7.24
China (50-59)                        7,476     7,389     745      9         9.97     0.12
Africa (60-68)                       2,077     2,113     343      185       16.51    8.76
North America (70-74)                7,371     7,285     155      179       2.10     2.46
Central America (76, 78)             1,075     1,091     108      120       10.00    11.00
South America, Antarctica (80-89)    1,453     1,554     273      180       18.79    11.58
Australia and Pacific (91-98)        3,062     3,019     180      119       5.88     3.94
TOTAL                                42,136    42,244    3,833    2,443     9.10     5.78

Table A-6. Number of stations with errors in November. Figures are given in total stations and
percentage of stations with errors in the regions shown.

                                     Stations            Errors             Error Percentage
Region (block #s)                    CQCHT     NUAV      CQCHT    NUAV      CQCHT    NUAV
Europe (01-17)                       5,347     4,900     214      432       4.00     8.62
Former USSR (20-38)                  9,152     8,824     838      704       9.16     7.98
Asia (40-41, 44-48)                  3,868     3,797     443      295       11.45    7.77
India (42 and 43)                    1,618     1,657     655      76        40.48    4.59
China (50-59)                        7,070     5,781     695      10        9.83     0.17
Africa (60-68)                       1,779     1,778     282      140       15.85    7.87
North America (70-74)                7,195     7,154     130      190       1.81     2.66
Central America (76, 78)             964       978       93       92        9.65     9.41
South America, Antarctica (80-89)    1,552     1,597     259      184       16.69    11.52
Australia and Pacific (91-98)        3,533     3,115     446      297       12.62    9.53
TOTAL                                42,078    40,954    4,057    2,420     9.64     5.91